(57) ABSTRACT

A system is disclosed for improving the efficiency of data
transactions by permitting the length of burst transactions to
be modified based upon system performance. A bus interface
unit monitors the response times of memory devices, and, if
WAIT periods are required before the memory device
responds, the bus interface unit increases the length of the
burst. Preferably, the bus interface unit includes a table of
historical response times of various memory ranges, and
determines an optimal burst length for each memory range.
When a data transaction is made to a particular memory
location, the BIU accesses the table and asserts a BURST
signal for a sufficient period of time to accomplish the
optimal burst length. After the optimal burst length has been
reached in the existing memory transaction, the BURST
signal is deasserted to end the burst cycle.

SYSTEM FOR IMPLEMENTING AN ADAPTIVE BURST LENGTH FOR BURST
MODE TRANSACTIONS OF A MEMORY BY MONITORING RESPONSE TIMES FOR
DIFFERENT MEMORY REGIONS

1. Field of the Invention

The present invention generally relates to computer systems
which include memory devices that are subject to read and write
transactions by a central processing unit (CPU) or other system
device. Still more particularly, the present invention relates
to a computer system implementation in which data is
transferred between memory and the CPU in bursts. Still more
particularly, the present invention relates to a system in
which the burst length of the data stream can be modified based
upon a variety of criteria to improve the efficiency of the
computer system.

2. Description of the Relevant Art

For most computer systems, the number of clock cycles required
for a data access to a memory device depends upon the component
accessing the memory and the speed of the memory unit. Most of
the memory devices in a computer system are slow relative to
the clock speed of the central processing unit (CPU). As a
result, the CPU is forced to enter wait states when seeking
data from the slower memory devices. Because of the relative
slowness of most memory devices, the efficiency of the CPU can
be severely compromised. As the operating speed of processors
increases and as new generations of processors evolve, it is
advantageous to minimize wait states in memory transactions to
fully exploit the capabilities of these new processors.

One technique which has been used and which has gained
widespread acceptance in computer systems is the use of one or
more high speed cache memory devices. Typically, the cache
memory is placed intermediate the CPU and system memory, and is
used to store frequently used, or recently used, data. While
cache memory devices have reduced processor latency times in
memory transactions, a problem still exists with latency in
memory transactions, especially for memory transactions to
other memory sub-systems, such as the system memory.

Another technique which has been used to reduce processor
latency in memory transactions is to increase the amount of
information transferred in each memory access. Protocols exist
for bursting data streams under certain conditions in some
systems, such as the PCI (Peripheral Component Interconnect)
bus. The PCI bus has a protocol which permits data to be
transferred in a burst mode. The burst mode feature allows
reads or writes to consecutive memory locations at high speed,
via burst cycles. The normal procedure for reading or writing
from memory is that the CPU in a first clock cycle generates
the address signals on the address bus, and then in the
following clock cycle, data is transferred to or from system
memory. Since the PCI data bus, for example, is 32-bits wide, a
total of four bytes (each byte has 8 bits) of data can be read
or written by the CPU for every two clock cycles. Each set of
four bytes transferred on the data bus is referred to as a
"double word." In burst mode, additional sequential double
words may be transferred during subsequent clock cycles without
intervening address phases. For example, a total of four double
words can be read into the CPU using only five clock cycles
because only the starting address is sent out on the address
bus, and subsequently the first double word of data is read
during the second cycle, the next double word of data during
the third cycle, and so on. Thus, where a normal transfer of
four double words would take at least eight clock cycles, the
burst mode permits four double words to be transferred in five
clock cycles. Burst mode operation thereby accommodates
relatively high data transfer rates, and significantly reduces
the latency involved in a memory transfer.

Despite the advantages of operating in burst mode, the burst
mode feature has certain limitations. One limitation inherent
in burst mode transfers is that the burst mode length typically
is fixed, and cannot be altered. In addition, the burst mode
feature is not responsive to actual latency conditions in the
system. In the devices and busses which support burst mode
transactions, the burst length typically is fixed by the system
designer. The optimal burst length value, however, is dependent
upon a number of factors that typically are not known during
the design process. Consequently, a system designer does not
have all of the information necessary to make a fully informed
decision regarding the optimal burst length for a particular
memory device.

SUMMARY OF THE INVENTION

The present invention solves the shortcomings and deficiencies
of the prior art by providing a computer system that includes a
bus interface unit (also referred to as "BIU") to orchestrate
data accesses between memory devices and a central processing
unit (CPU) core. Preferably, the BIU includes a register with a
dedicated bit to indicate whether the adaptive burst mode
feature is to be implemented by the BIU. The bus interface unit
preferably includes a table of historical data on the latencies
experienced for different memory regions. A second table also
preferably is provided which indicates an optimal burst length
for particular latency periods (which may be measured by WAIT
states, or other criteria). The optimal burst length for a
latency period can be fixed by the system designer, or can be
modified during system operation by a programmer or through a
self-executing algorithm. The table of historical data
preferably is accumulated by the bus interface unit based on
observations of signals appearing on the CPU local bus and
system bus. When a particular access then is routed through the
bus interface unit to a particular memory range, the BIU
implements a burst mode transfer with a burst length specified
in the look-up table.

The actual implementation of the burst transfer may be made
through the use of a BURST control signal. According to this
embodiment, the memory continues to burst data as long as the
BURST line remains asserted by the BIU. When the BURST line is
deasserted, the target memory unit completes the burst
transaction. The BIU determines how long to assert the BURST
signal based upon the look-up table. Alternatively, the length
of the burst data transfer may be indicated by signaling
between the BIU and target memory device prior to the time that
the memory device drives out the desired data. In this
embodiment, the BIU could indicate to the memory device that a
burst data transfer is desired, and the memory device could
respond with the expected response time, from which the BIU
determines the optimal burst length.

The historical data to be monitored preferably includes WAIT
states, which define the length of time between the initiation
of a memory transaction and the response of the first data item
from the targeted memory device. The BIU then stores, as a
running count of clock signals, the WAIT period. In the
preferred embodiment, the BIU stores the WAIT periods on an
address range basis. Alternatively, the
WAIT periods (or other latency measurement) can be stored on a
component by component basis. The look-up table preferably
includes x bits to define the address range, y bits to define
the latency period, and z bits to define the optimal burst
length.

In addition to the WAIT periods, the BIU also may look at other
criteria when assigning the optimal burst length. For example,
the BIU can monitor the number of accesses to a memory range
within a predetermined period as an indication of whether to
increase the length of the data transfer. Other criteria which
can be monitored by the BIU include (1) the internal state of
the CPU, such as pending load, fetch, and store requests; (2)
historical data regarding previous execution history; (3) the
content of memory responses; and (4) the current state of any
CPU special mode bits.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more detailed description of the preferred embodiment of
the present invention, reference will now be made to the
accompanying drawings, wherein:

FIG. 1 depicts a functional block diagram of a computer system
constructed in accordance with an exemplary embodiment of the
present invention;

FIG. 2 depicts a look-up table in the bus interface unit of
FIG. 1, which is used by the bus interface unit to determine a
burst length for each memory address range;

FIG. 3 shows a look-up table for use by the bus interface unit
in determining burst length based upon latency;

FIG. 4 depicts an exemplary register for the bus interface unit
which functions to indicate whether an adaptive burst mode is
to be implemented.

While the invention is susceptible to various modifications and
alternative forms, specific embodiments thereof are shown by
way of example in the drawings and will herein be described in
detail. It should be understood, however, that the drawings and
detailed description thereto are not intended to limit the
invention to the particular form disclosed, but on the
contrary, the intention is to cover all modifications,
equivalents and alternatives falling within the spirit and
scope of the present invention as defined by the appended
claims.

DETAILED DESCRIPTION OF THE INVENTION

Turning now to the drawings, FIG. 1 is a block diagram of a
general computer system 10 for implementing the present
invention. The computer system 10, in accordance with generally
known conventions, includes a microprocessor or "processor" 100
which functions as the brains of the computer system 10.
Processor 100 preferably includes a CPU core 50 coupled to a
local bus 165. Although not shown in FIG. 1, the processor 100
also may include a cache memory resident on local bus 165. CPU
core 50 couples to a system bus 125 via a local bus interface
unit (BIU) 40. As shown in FIG. 1, a clock 45 may also connect
to the local bus 165, or alternatively, may be located on the
system bus or some other peripheral bus. As one skilled in the
art will understand, any of the components of the processor
100, such as clock 45, may be located externally from the
processor 100 without departing from the principles of the
present invention. Similarly, other components shown as
external to the processor 100 in FIG. 1 may be integrated as
part of microprocessor 100. As will be understood by one
skilled in the art, in such a situation the system bus 125 may
form part of the CPU local bus 165.

The computer system 10 also preferably includes a peripheral
bus bridge 110 and a memory control unit 150, all connected to
the processor 100 via system bus 125 and local bus interface
40. The peripheral bus bridge 110 provides an interface between
an external peripheral bus 120 and the system bus 125 and
orchestrates the transfer of data, address and control signals
between these busses in accordance with [illegible] techniques.
As shown in FIG. 1, memory devices 135 [illegible] peripheral
bus 120. Other memory devices [illegible] 125, such as the
[illegible] controller 190. The cache [illegible] controller
190 includes both a cache memory [illegible] in accordance with
conventional techniques.

Referring still to FIG. 1, an external system memory 175
preferably couples to system bus 125 through memory controller
150. The memory control unit 150 of FIG. 1 couples to the
system bus 125 and to a memory bus 170 to control memory
transactions between system components and system memory 175.
The system memory 175 typically includes banks of dynamic
random access memory (DRAM) circuits. The DRAM banks, according
to normal convention, comprise the working memory of the
integrated processor 100. The memory bus 170, which
interconnects the DRAM banks to the memory controller 150,
includes memory [illegible]. In accordance with the exemplary
embodiment of FIG. 1, the memory control unit 150 may also
connect to a read only memory (ROM) device [illegible]. The ROM
device may store the BIOS (basic input/output system)
instructions for the computer system. As one skilled in the art
will understand, the BIOS ROM may be located elsewhere in the
computer system if desired.

In its illustrated form, computer system 10 embodies a single
processor, single-cache architecture. It should be understood,
however, that the present invention may be adapted to
multi-processor and/or multi-cache systems. It is further
understood that a variety of other devices may be coupled to
peripheral bus 120. The peripheral bus may comprise a PCI bus,
an ISA bus, an EISA bus, or any other standard bus. Peripheral
memory device 135 may be illustrative of a variety of memory
devices. Exemplary memory devices include hard disk drives,
floppy drives, and CD ROM units. Thus, according to normal
convention, the processor 100 couples to other peripheral
computer components through one or more external buses, such as
system bus 125, peripheral bus 120, and memory bus 170. Various
peripheral devices may reside on these busses. These peripheral
devices may include memory devices, network cards or other
structures which could be the target of a read or write request
by the CPU core 50 or some other system component.

The CPU core 50 is illustrative of, for example, a
Pentium-compatible microprocessor, with reduced instruction set
computer (RISC) operations, such as the assignee's "K-5"
superscalar microprocessor. The CPU local bus 165 is exemplary
of a Pentium-compatible style local bus. The CPU local bus 165
includes a set of data lines, a set of address lines, and a set
of control lines (not shown individually). Alternatively, the
CPU core 50 and CPU local bus 165 may support other instruction
set operations, without departing from the principles of the
present invention.

Referring still to FIG. 1, the present invention preferably
includes a cache memory and controller 190. The cache memory
functions as an intermediate storage device to store recently
accessed data, as long as that data is determined to be
cacheable. The cache controller includes address tag and
state information. The address tag indicates a physical
address in system memory 175 or in external memory (such
as may be represented by peripheral memory device 135, for
example) corresponding to each entry within the cache
memory. In accordance with normal convention, each entry
within the cache memory is capable of storing a line of data.
The cache controller also preferably includes an address tag
and state logic circuit that contains and manages the address
tag and state information, and a comparator circuit for
determining whether a cache hit has occurred. Although not
shown, the cache controller may include other logical
elements, including for example a snoop write-back circuit
that controls the write-back of dirty data within the cache
memory. It will be appreciated by those skilled in the art that
the cache memory and controller may contain other additional
conventional circuitry to control well-known caching
functions such as various read, write, update, invalidate,
copy-back, and flush operations. Such circuitry may be
implemented using a variety of specific circuit configurations.
Examples of such specific circuit configurations may
be found in a host of publications of the known prior art,
including U.S. Pat. No. 5,091,875 issued to Rubinfeld on
Feb. 25, 1992 and U.S. Pat. No. 5,091,876 issued to Sachs
et al. on Feb. 25, 1992.

In accordance with the preferred embodiment of the
present invention, the BIU 40 couples to both the local bus
165 and the system bus 125 for orchestrating the transfer of
address, data and control signals between these respective
busses. Referring now to FIGS. 1 and 4, the bus interface
unit (BIU) 40 preferably includes a register 210 for indicating
whether the adaptive burst feature is enabled. As shown
in FIG. 4, register 210 preferably includes a dedicated bit,
marked as bit AB. If bit AB of register 210 is enabled, then
the adaptive burst feature is implemented by the BIU. If bit
AB is not enabled, a fixed length burst may be used. The
status of bit AB may be set as part of the system
initialization, or may be subsequently activated by a system
programmer. Although FIG. 4 shows register 210 as an eight
bit register, one skilled in the art will understand that register
210 may be implemented with a register of any size.
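
The role of bit AB can be illustrated with a short sketch. The bit position within register 210 and the fixed burst length used when the feature is disabled are not given in the text; both are assumptions chosen only for illustration, as are the names `AB_BIT`, `FIXED_BURST_BYTES` and `select_burst_length`.

```python
# Hypothetical model of register 210. The position of bit AB and the
# fixed fallback length are assumptions; the text specifies neither.
AB_BIT = 0x01            # assumed position of bit AB in the 8-bit register
FIXED_BURST_BYTES = 16   # assumed fixed burst length when AB is clear


def select_burst_length(register_210: int, adaptive_bytes: int) -> int:
    """Return the adaptive length if bit AB is set, else the fixed length."""
    if register_210 & AB_BIT:
        return adaptive_bytes
    return FIXED_BURST_BYTES
```

With these assumed values, `select_burst_length(0x01, 64)` yields the adaptive length of 64 bytes, while `select_burst_length(0x00, 64)` falls back to the fixed 16 bytes.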

In accordance with the principles of the present invention,
the BIU 40 monitors certain system parameters for the
purpose of determining the optimal burst length for particular
memory ranges or devices. Referring now to FIGS. 1, 3
and 4, the BIU preferably includes a pair of look-up tables
225, 250 for assigning optimal burst lengths for particular
memory ranges or components. As one skilled in the art will
understand, tables 225 and 250 may be combined together in
a single table if the BIU is adequately programmed to define
a burst length for specific latency periods. This can be done
using a formula, or algorithmic definition for burst length.

Referring first to FIG. 3, look-up table 225 represents the
optimal burst length for specific measured latency periods.
It should be understood that the values in table 225 are
merely intended to be exemplary, and should not be construed
as limiting the invention to the values represented.
Thus, according to the example shown in table 225, if a
memory response to a CPU request requires the CPU to
enter a WAIT period for 2-3 clock signals (which are
generated by clock 45 in FIG. 1), then table 225 indicates that
the optimal burst length is 8 bytes. Similarly, if the WAIT
period for response comprises 10 clock signals, the example
shown in table 225 defines an optimal burst length of 64
bytes. The values entered in table 225 may be fixed by a
system designer, or may be variable. If the values in table
225 are variable, they can be varied either by a programmer,
or by the system itself, based upon an algorithmic definition.
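
The mapping performed by table 225 may be sketched as follows. Only two rows are taken from the example above (a WAIT period of 2-3 clock signals gives 8 bytes, and 10 clock signals gives 64 bytes); the remaining rows, the clamping of longer WAIT periods to the last row, and the names `TABLE_225` and `optimal_burst_length` are assumptions made for illustration.

```python
import bisect

# Each row: (largest WAIT period in clock signals covered by the row,
#            burst length in bytes).
TABLE_225 = [
    (1, 4),    # assumed row
    (3, 8),    # from the text: 2-3 clock signals -> 8 bytes
    (6, 16),   # assumed row
    (9, 32),   # assumed row
    (10, 64),  # from the text: 10 clock signals -> 64 bytes
]


def optimal_burst_length(wait_clocks: int) -> int:
    """Look up the burst length for a measured WAIT period."""
    bounds = [bound for bound, _ in TABLE_225]
    row = bisect.bisect_left(bounds, wait_clocks)
    # Assumption: WAIT periods beyond the last row use the last row.
    row = min(row, len(TABLE_225) - 1)
    return TABLE_225[row][1]
```

Because the table is held as data rather than as logic, it could be fixed at design time or rewritten during operation, matching the two options described above.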
Referring now to FIG. 4, table 250 represents the actual
historical values determined by the BIU 40, or associated
circuitry, representing the historical time for servicing
memory requests to particular address ranges. Once again, it
should be understood that the values depicted in table 250
are meant only as an illustration. Table 250 may be formatted
into particular address ranges, with as much granularity
as desired. Thus, for example, system memory may be
defined on a page by page basis, or the entire contents may
be defined as a single memory range. The BIU 40 monitors
addresses within the defined address ranges, and stores
information in column 2 indicative of the period of response.
The value in column 2 may be based upon the most current
access to that memory range, or may represent an average of
previous accesses. In the preferred embodiment, the unit of
measurement is the number of clock signals received from
clock 45 between the initial memory request and the beginning
of the response from the memory unit. As will be
apparent, other units of measurement may be used, as well as
other definitions of the period to measure.
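
One way the column 2 value might be maintained is sketched below. The sketch keeps both the most recent latency and a running average for a range, since the text allows either. The 4096-byte page size and the names `range_index` and `RangeLatency` are assumptions, not details taken from the disclosure.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed granularity: one address range per page


def range_index(address: int, page_size: int = PAGE_SIZE) -> int:
    """Map an address to the index of its address range."""
    return address // page_size


@dataclass
class RangeLatency:
    samples: int = 0
    average_clocks: float = 0.0
    last_clocks: int = 0

    def record(self, request_clock: int, response_clock: int) -> None:
        """Record the clock signals counted between request and response."""
        waited = response_clock - request_clock
        self.samples += 1
        # Incremental mean, so no per-access history needs to be kept.
        self.average_clocks += (waited - self.average_clocks) / self.samples
        self.last_clocks = waited
```

The incremental mean is one design choice among several; a hardware implementation might prefer a saturating counter or a shift-based moving average.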

Once the latency value has been determined and stored in
table 250, the BIU 40 (or associated circuitry) accesses the
look-up table 225 to determine the optimal burst length
based on the measured latency period. In the preferred
embodiment, this value for burst length then is stored in
column 3 of table 250. It should be understood that while
"columns" of look-up table 250 are discussed, in the preferred
embodiment table 250 is implemented by a memory
map, or by registers. Thus, each "column" in actuality
comprises a predetermined number of bits dedicated to
represent values. Thus, for example, eight bits may be
dedicated to represent the optimal burst length (column 3 of
table 250), thus providing 256 possible values for burst
length. In the preferred embodiment, if no historical
information is provided for the latency of a particular address
range, then preferably a default value is used for the optimal
burst length.
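
The bit-field view of table 250 can be illustrated with a packed entry. The eight-bit burst length field follows the example above; the widths chosen for the address range and latency fields, the default burst length, and the function names are assumptions made only for this sketch.

```python
RANGE_BITS = 20     # assumed width of the address range field
LATENCY_BITS = 8    # assumed width of the latency field
BURST_BITS = 8      # eight bits, as in the example: 256 possible values
DEFAULT_BURST = 16  # assumed default when no history exists


def pack_entry(range_idx: int, latency: int, burst: int) -> int:
    """Pack the three 'columns' of one table 250 entry into an integer."""
    if not 0 <= range_idx < (1 << RANGE_BITS):
        raise ValueError("address range index out of field range")
    if not 0 <= latency < (1 << LATENCY_BITS):
        raise ValueError("latency out of field range")
    if not 0 <= burst < (1 << BURST_BITS):
        raise ValueError("burst length out of field range")
    return (range_idx << (LATENCY_BITS + BURST_BITS)) | (latency << BURST_BITS) | burst


def unpack_entry(entry: int):
    """Return (range index, latency, burst length) from a packed entry."""
    burst = entry & ((1 << BURST_BITS) - 1)
    latency = (entry >> BURST_BITS) & ((1 << LATENCY_BITS) - 1)
    range_idx = entry >> (LATENCY_BITS + BURST_BITS)
    return range_idx, latency, burst


def burst_for(table: dict, range_idx: int) -> int:
    """Return the stored burst length, or the default if no entry exists."""
    entry = table.get(range_idx)
    if entry is None:
        return DEFAULT_BURST
    return unpack_entry(entry)[2]
```

Here the dictionary stands in for the memory map or register file; a range with no entry returns the default value, as the preferred embodiment describes.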

The manner in which the system implements the adaptive
burst mode feature, and the optimal burst length value of
table 250, may vary. In the embodiment shown in FIG. 1, a
BURST control line is provided by the BIU 40. In this
exemplary embodiment, the BURST signal is asserted by the
BIU 40 to indicate a burst transaction is desired. The
responding memory device preferably responds by bursting
data until the BURST control signal is deasserted by the BIU
40. As an alternative, the burst length may be defined by
signaling between the CPU and the target memory device
prior to the response of the memory device. This signaling
could be transmitted over existing control, address and data
lines through the use of unique combinations of signals, or
additional lines could be defined specifically for this type of
encoding of the burst length. Thus, in this alternative
embodiment, the BIU 40 could signal the target device that
a burst transaction is desired of x bytes. As yet another
alternative, the BIU 40 could communicate to the target
memory device that a burst transaction is desired, and the
associated memory control unit (or bus bridge) would then
be responsible for defining the optimal burst length, based
on similar criteria monitored by the bus interface unit in the
embodiment discussed above. Thus, the memory device
would be responsible for monitoring the response times for
various address ranges and providing that information to the
CPU at the beginning of a cycle. In response, the BIU would
then determine the optimal burst length, based on the
concepts described in tables 225 and 250.
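
The first embodiment, in which the memory device bursts data until the BURST signal is deasserted, can be modelled in software as follows. This is a behavioural sketch only: the 4-byte bus width, the one-word-per-step timing, and the class names `MemoryDevice` and `BusInterfaceUnit` are assumptions, and real signal timing is not represented.

```python
class MemoryDevice:
    def __init__(self, words):
        self.words = list(words)

    def respond(self, start_index, burst_asserted):
        """Drive consecutive words for as long as BURST is asserted."""
        index = start_index
        while burst_asserted() and index < len(self.words):
            yield self.words[index]
            index += 1


class BusInterfaceUnit:
    def __init__(self, bus_width_bytes=4):
        self.bus_width_bytes = bus_width_bytes  # assumed 32-bit data bus
        self.burst = False                      # state of the BURST line

    def read(self, device, start_index, burst_bytes):
        """Hold BURST asserted until the chosen burst length is reached."""
        remaining = -(-burst_bytes // self.bus_width_bytes)  # ceiling division
        if remaining <= 0:
            return []
        self.burst = True                       # assert BURST
        received = []
        for word in device.respond(start_index, lambda: self.burst):
            received.append(word)
            remaining -= 1
            if remaining == 0:
                self.burst = False              # deassert BURST to end the burst
        self.burst = False                      # also covers a device that ran out of data
        return received
```

For a 16-byte burst on the assumed 4-byte bus, the BIU in this model collects four words and then deasserts BURST, at which point the device stops driving data.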

As an alternative to the use of the tables depicted in FIGS.
3 and 4, the elapsed time period from the start of a memory
cycle can be monitored by the BIU 40. If the elapsed time
exceeded a historical or programmed limit, then the BIU
would assume that the address resulted in a cache miss of the
cache memory 190 (or any other cache in the system). In
response to this assumption, the BIU would increase the
burst length to amortize the higher overhead involved in the
memory access over a larger number of bytes. As an
alternative to measuring the elapsed time period, the BIU
may be configured to monitor a cache miss signal from the
cache memory 190. If the cache miss signal was received (as
shown in FIG. 1), then the BIU 40 would increase the burst
length.
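
The amortization argument can be made concrete with a small calculation. The sketch assumes a 4-byte bus that moves one word per clock once the WAIT period has elapsed, and it doubles the burst length up to an assumed 64-byte ceiling; the text does not say by how much the length is increased, so the doubling policy and both function names are illustrative only.

```python
def clocks_per_byte(wait_clocks, burst_bytes, bus_width_bytes=4):
    """Average clocks per byte: WAIT overhead plus one clock per word."""
    transfers = -(-burst_bytes // bus_width_bytes)  # ceiling division
    return (wait_clocks + transfers) / burst_bytes


def adjust_burst_length(current_bytes, elapsed_clocks=None, limit_clocks=None,
                        cache_miss=False, max_bytes=64):
    """Lengthen the burst on a slow response or a cache miss signal."""
    slow = (elapsed_clocks is not None and limit_clocks is not None
            and elapsed_clocks > limit_clocks)
    if slow or cache_miss:
        return min(current_bytes * 2, max_bytes)  # assumed doubling policy
    return current_bytes
```

Under these assumptions, a 10-clock WAIT period costs 1.5 clocks per byte for an 8-byte burst but about 0.41 clocks per byte for a 64-byte burst, which is the overhead spreading the paragraph above describes.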

As another alternative, the nature of the program flow can 
be monitored. If, for example, an interrupt signal is gener- 
ated which results in a miss of the internal cache in the 
processor 100, then the BIU sets the burst length longer, 
anticipating that the memory accesses will be to a different 
location than those previously cached. 

The present invention also contemplates the possibility of 
changing the length of a burst during a burst transaction, if 
a memory request with a higher priority appears at the BIU 
40. Thus, in this embodiment, and referring again to FIG. 1, 
the BIU 40 includes a mechanism, such as a register, for 
assigning priority levels to requests from particular devices. 
If the BIU 40 is acting as a bus arbiter, and a request is made 
to a memory location on the system bus 125 by an external 
bus master, the BIU 40 preferably responds by asserting the 
BURST signal to implement an optimal burst length transfer. 
If during that transaction, the CPU core 50 makes a memory 
request, the BIU 40 may provide an early termination of the 
existing burst cycle by deasserting the BURST signal, thus 
enabling the BIU 40 to process the memory request of the 
CPU core 50. 
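
The early termination described above may be sketched as follows. The priority values, the device names in `PRIORITY`, and the model of one transfer per clock are assumptions; the text states only that the BIU assigns priority levels and may deassert BURST when a higher priority request appears.

```python
# Assumed priority levels held by the BIU; larger means more urgent.
PRIORITY = {"external_bus_master": 1, "cpu_core": 2}


def burst_with_preemption(planned_transfers, active_requester, arrivals):
    """Return the number of transfers completed before BURST is deasserted.

    arrivals maps a clock index (0-based, one transfer per clock) to the
    name of a device whose request reaches the BIU at that clock.
    """
    active_priority = PRIORITY[active_requester]
    for clock in range(planned_transfers):
        requester = arrivals.get(clock)
        if requester is not None and PRIORITY[requester] > active_priority:
            return clock  # BURST deasserted early to service the new request
    return planned_transfers
```

In this model a burst of eight transfers for an external bus master is cut to three if the CPU core makes a request at clock 3, while a request from an equal or lower priority device leaves the burst untouched.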

Numerous variations and modifications will become 
apparent to those skilled in the art once the above disclosure 
is fully appreciated. It is intended that the following claims 
be interpreted to embrace all such variations and modifica- 
tions. 

What is claimed is:

1. A system for providing adaptive burst lengths of data, 
comprising: 

a CPU core coupled to a local bus; 

a bus interface unit coupled between said local bus and a
system bus, wherein said bus interface unit is configured
to receive a memory request from the CPU core
via the local bus and to perform a memory operation via
said system bus in response to said memory request;

a memory device coupled to said system bus, wherein said
memory device is configured to respond to said
memory operation;

wherein said bus interface unit is configurable to:
use a stored value maintained in the bus interface unit
to control a burst length for the memory operation;
measure a response time of said memory device during
said memory operation;
modify the stored value used to control the burst length
for the memory operation dependent upon the
response time; and
maintain information correlating: (i)
address ranges of the memory device, (ii) values
indicative of response times of memory operations
comprising memory addresses falling within the
address ranges of the memory device, and (iii) burst
lengths to be used during memory operations comprising
memory addresses falling within the address
ranges of the memory device.

2. The system as in claim 1, wherein said bus interface
unit comprises a look-up table correlating: (i) values indicative
of response times of memory operations comprising
memory addresses falling within the address ranges of the
memory device, and (ii) burst lengths to be used during
memory operations comprising memory addresses falling
within the address ranges of the memory device.

3. The system as in claim 2, wherein the burst lengths
within the look-up table are determined by a programmer.

4. The system as in claim 2, wherein the memory device 
is coupled to a peripheral bus. 

5. The system as in claim 1, wherein said bus interface 
unit is configured to assert a control signal indicative of the 
burst length during the memory operation. 

6. The system as in claim 5, wherein the bus interface unit 
is configured to deassert the control signal to end the 
memory operation. 

7. The system as in claim 1, wherein the memory device 
comprises the system memory. 

8. The system as in claim 1, wherein the memory device 
comprises a cache memory. 

9. A method for selecting a burst length for a memory
sub-system, the sub-system having a plurality of memory
locations defining a plurality of contiguous ranges of addresses,
the method comprising:

determining a response time for each of the plurality of 
contiguous ranges of addresses of the memory sub- 
system during a memory operation; and 

selecting a burst length for each of the plurality of 
contiguous ranges of addresses of the memory sub- 
system dependent upon the response time for each of 
the plurality of contiguous ranges of addresses. 

10. The method as in claim 9, wherein a first burst length
is selected if the response time is less than or equal to a
predetermined value, and wherein a second burst length is
selected if the response time is greater than the predetermined
value.

11. The method as in claim 9, wherein the response time
contributes to an average response time for the memory
sub-system, and wherein the average response time for the
memory sub-system is used to select the burst length for the
memory sub-system.

* * * * *